SemanticScuttle - klotz.me » klotz: machine learning+python

klotz: machine learning* + python*

I Was Wrong: Start Simple, Then Move to More Complex

The author describes changing their approach to clustering mixed data: start with the simpler Gower distance metric, and move to more complex embedding techniques such as UMAP only when needed. They introduce 'Gower Express', an optimized, accelerated implementation of Gower distance.

2025-09-05 Tags: clustering, data science, machine learning, gower distance, umap, gower express, mixed data, python, scikit-learn, data analysis, shrunk by klotz
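
For readers unfamiliar with the Gower distance mentioned in the entry above, here is a minimal pure-Python sketch of the usual definition: numeric features contribute a range-normalized absolute difference, categorical features contribute 0 on a match and 1 otherwise, and the result is the mean over features. It is not the Gower Express API; the function name `gower_distance`, the `ranges` argument, and the sample rows are all hypothetical and chosen only for illustration.

```python
def gower_distance(a, b, ranges):
    """Gower distance between two mixed-type rows.

    ranges[k] is the numeric range (max - min) of feature k across the
    dataset, or None when feature k is categorical. (Hypothetical helper,
    not part of any library.)
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical feature: simple mismatch indicator.
            total += 0.0 if x == y else 1.0
        elif r == 0:
            # Constant numeric feature: contributes no distance.
            total += 0.0
        else:
            # Numeric feature: absolute difference scaled to [0, 1].
            total += abs(x - y) / r
    return total / len(ranges)


# Hypothetical rows: (age, income, city)
row_a = (30, 50_000, "Paris")
row_b = (40, 70_000, "Lyon")
ranges = (50, 100_000, None)  # assumed dataset-wide ranges; city is categorical

d = gower_distance(row_a, row_b, ranges)
print(round(d, 4))
```

Because every per-feature term lies in [0, 1], the distance is directly comparable across numeric and categorical columns, which is what makes it a reasonable first thing to try on mixed data.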

A Visual Guide to Tuning Random Forest Hyperparameters

This article explores the impact of hyperparameters on random forests, both in terms of performance and visual representation. It compares the performance of a default random forest with tuned decision trees and examines the effects of various hyperparameters like `n_estimators`, `max_depth`, and `ccp_alpha` using visualizations of individual trees, predictions, and errors.

2025-09-05 Tags: data science, machine learning, random forests, hyperparameter tuning, python, data visualization, scikit-learn, decision trees, james gibbins by klotz

Google Launched LangExtract, a Python Library for Structured Data Extraction from Unstructured Text

Google has introduced LangExtract, an open-source Python library designed to help developers extract structured information from unstructured text using large language models such as the Gemini models. The library simplifies the process of converting free-form text into structured data, offering features like controlled generation, text chunking, parallel processing, and integration with various LLMs.

2025-08-09 Tags: machine learning, data engineering, python, google, langextract, llm, gemini, information extraction, e by klotz

Namers - Turftopic

This page details the topic namers available in Turftopic, which automatically assign human-readable names to topics. It covers namers based on large language models (local and OpenAI) and on n-gram patterns, and provides API references for the `TopicNamer`, `LLMTopicNamer`, `OpenAITopicNamer`, and `NgramTopicNamer` classes.

2025-07-15 Tags: topic modeling, llm, openai, n-grams, turftopic, python, machine learning, text analysis, classification, solon by klotz

Topic Model Labelling with LLMs

A Python tutorial on reproducibly labeling topic models with GPT-4o mini. The article walks through training a FASTopic model and labeling its topics with the LLM, emphasizing reproducibility and control over the labeling process.

2025-07-15 Tags: llm, machine learning, nlp, python, topic modeling, fastopic, turftopic, gpt-4, classification by klotz

Hands-On Attention Mechanism for Time Series Classification, with Python

This article demonstrates how to use the attention mechanism in a time series classification framework, specifically for classifying normal sine waves versus 'modified' (flattened) sine waves. It details the data generation, model implementation (using a bidirectional LSTM with attention), and results, achieving high accuracy.

2025-06-01 Tags: deep learning, python, time series, transformers, attention, lstm, classification, production engineering, observability by klotz
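
As a rough illustration of the attention step named in the entry above, the sketch below pools a sequence of hidden states into one context vector using softmax-normalized dot-product scores. It is an assumption-laden toy, not the article's model: the article uses a bidirectional LSTM, whereas here the hidden states are hard-coded, and `attention_pool` and the `query` vector (which would be learned in practice) are hypothetical names.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention pooling (hypothetical helper).

    hidden_states: list of equal-length vectors, one per time step.
    query: vector of the same length; assumed learned in a real model.
    Returns the attention weights and the weighted-sum context vector.
    """
    # One score per time step: dot product of hidden state and query.
    scores = [sum(h_k * q_k for h_k, q_k in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of the hidden states.
    context = [
        sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(dim)
    ]
    return weights, context


# Stand-in hidden states for three time steps (made-up values).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]

weights, context = attention_pool(hidden, query)
print(weights, context)
```

In a classifier, the context vector would feed a final dense layer, and the weights can be plotted over time to see which parts of the series the model attends to.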

Data-Science-Espresso/Reinforcement-Learning-TicTacToe

A GitHub repository for a reinforcement learning Tic Tac Toe project, consisting of a single Python file, `TicTacToeRL.py`. It had 0 stars and 0 forks when this summary was written.

2025-05-28 Tags: reinforcement learning, tic tac toe, python, github, machine learning, q learning by klotz

Python Pandas Ditches NumPy for Speedier PyArrow

Pandas 3.0 will significantly boost performance by replacing NumPy with PyArrow as its default engine, enabling faster loading and reading of columnar data.

2025-05-27 Tags: python, pandas, numpy, pyarrow, data analysis, performance, machine learning by klotz

How To Automate SEO Keyword Clustering By Search Intent With Python

This practical guide uses SERP comparisons and Python to group keywords by intent, faster and more intuitively.

2025-05-19 Tags: keyword clustering, search intent, python, serp, automation, machine learning by klotz

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

PaperCoder is a multi-agent LLM system that transforms scientific papers into code repositories through a three-stage pipeline: planning, analysis, and code generation. It aims to create faithful, high-quality implementations.

2025-04-26 Tags: paper2code, llm, code generation, machine learning, papercoder, ai, python, openai, scientific papers by klotz
